Bar Charts and Histograms
October 27, 2023
The Grammar of Graphics
- Data viz has a language with its own grammar
- Basic components include:
  - Data we are trying to visualize
  - Aesthetics (dimensions)
  - Geom (e.g. bar, line, scatter plot)
  - Color scales
  - Themes
  - Annotations
Let’s start with the first two, the data and the aesthetic.
library(readr)
library(ggplot2)
dem_summary <- read_csv("data/dem_summary.csv")
ggplot(dem_summary, aes(x = region, y = polyarchy))
This gives us the axes without any visualization, since we have not added a geom yet.
Now let’s add a geom. In this case we want a bar chart, so we add geom_col().
ggplot(dem_summary, aes(x = region, y = polyarchy)) +
  geom_col()
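As a side note, geom_col() takes the bar heights directly from the y aesthetic, whereas geom_bar() counts rows by default. Here is a minimal sketch of the difference. The data frame toy_summary and its values are made up for illustration and are not taken from dem_summary.csv.

```r
library(ggplot2)

# Hypothetical stand-in for dem_summary; the values are invented for illustration
toy_summary <- data.frame(
  region    = c("Region A", "Region B", "Region C"),
  polyarchy = c(0.30, 0.80, 0.50)
)

# geom_col(): bar height comes straight from the y aesthetic
p_col <- ggplot(toy_summary, aes(x = region, y = polyarchy)) +
  geom_col()

# geom_bar(stat = "identity") draws the same chart the long way
p_identity <- ggplot(toy_summary, aes(x = region, y = polyarchy)) +
  geom_bar(stat = "identity")

# geom_bar() with only x mapped counts the rows per region (one row each here)
p_count <- ggplot(toy_summary, aes(x = region)) +
  geom_bar()
```

Since dem_summary already holds one summarized value per region, geom_col() is the natural choice.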
OK, that gets the idea across, but it looks a little depressing, so let’s change the color of the bars by specifying fill = "steelblue".
ggplot(dem_summary, aes(x = region, y = polyarchy)) +
  geom_col(fill = "steelblue")
Note how the default gray fill of the bars is simply overwritten.
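The fill works as a fixed setting here because it sits outside aes(). A short sketch of the contrast with mapping fill to a variable follows. The data frame toy_summary is again a made-up placeholder, not the course data.

```r
library(ggplot2)

# Hypothetical stand-in for dem_summary; the values are invented for illustration
toy_summary <- data.frame(
  region    = c("Region A", "Region B", "Region C"),
  polyarchy = c(0.30, 0.80, 0.50)
)

# Setting: a constant outside aes() gives every bar the same color
p_set <- ggplot(toy_summary, aes(x = region, y = polyarchy)) +
  geom_col(fill = "steelblue")

# Mapping: inside aes(), fill varies with the data and a legend is added
p_mapped <- ggplot(toy_summary, aes(x = region, y = polyarchy, fill = region)) +
  geom_col()
```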
Now let’s add some labels with the labs() function:
ggplot(dem_summary, aes(x = region, y = polyarchy)) +
  geom_col(fill = "steelblue") +
  labs(
    x = "Region",
    y = "Avg. Polyarchy Score",
    title = "Democracy by region, 1990 - present",
    caption = "Source: V-Dem Institute"
  )
And that gives us…
Next, we reorder the bars with the reorder() function.
ggplot(dem_summary, aes(x = reorder(region, -polyarchy), y = polyarchy)) +
  geom_col(fill = "steelblue") +
  labs(
    x = "Region",
    y = "Avg. Polyarchy Score",
    title = "Democracy by region, 1990 - present",
    caption = "Source: V-Dem Institute"
  )
This way, we get a visually appealing ordering of the bars, from the highest to the lowest level of democracy…
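To see what reorder() is doing, it can help to look at the factor levels it produces. In this sketch the data frame toy_summary and its values are made up for illustration. The minus sign flips the sort so that the highest score comes first.

```r
# Hypothetical stand-in for dem_summary; the values are invented for illustration
toy_summary <- data.frame(
  region    = c("Region A", "Region B", "Region C"),
  polyarchy = c(0.30, 0.80, 0.50)
)

# reorder() returns a factor whose levels are sorted by the second argument;
# negating polyarchy sorts the regions from highest to lowest score
ordered_region <- reorder(toy_summary$region, -toy_summary$polyarchy)

levels(ordered_region)
# "Region B" "Region C" "Region A"
```

ggplot2 places the bars along the x-axis in the order of these levels, which is why the reordered factor changes the chart.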
Jumping ahead a little, let’s change the theme to theme_minimal().
ggplot(dem_summary, aes(x = reorder(region, -polyarchy), y = polyarchy)) +
  geom_col(fill = "steelblue") +
  labs(
    x = "Region",
    y = "Avg. Polyarchy Score",
    title = "Democracy by region, 1990 - present",
    caption = "Source: V-Dem Institute"
  ) +
  theme_minimal()
This gives us a clean, elegant look.
Note that you can also save your plot as an object to modify later.
dem_bar_chart <- ggplot(dem_summary, aes(x = reorder(region, -polyarchy), y = polyarchy)) +
  geom_col(fill = "steelblue")
Which gives us…
Now let’s add our labels.
dem_bar_chart <- dem_bar_chart +
  labs(
    x = "Region",
    y = "Avg. Polyarchy Score",
    title = "Democracy by region, 1990 - present",
    caption = "Source: V-Dem Institute"
  )
So now we have…
Now let’s add our theme.
dem_bar_chart <- dem_bar_chart + theme_minimal()
Voila!
Change the theme.
dem_bar_chart + theme_bw()
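One detail worth spelling out: adding to a saved plot with + returns a new plot object, so the saved one only changes when you assign the result. A minimal sketch follows, in which toy_summary and toy_chart are hypothetical names with made-up values.

```r
library(ggplot2)

# Hypothetical stand-in for dem_summary; the values are invented for illustration
toy_summary <- data.frame(
  region    = c("Region A", "Region B", "Region C"),
  polyarchy = c(0.30, 0.80, 0.50)
)

toy_chart <- ggplot(toy_summary, aes(x = reorder(region, -polyarchy), y = polyarchy)) +
  geom_col(fill = "steelblue")

# `+` builds a new plot; storing it under a new name leaves toy_chart untouched
toy_chart_bw <- toy_chart + theme_bw()

# Assign back to the same name when the change should stick
toy_chart <- toy_chart + theme_minimal()
```

This makes it easy to try out several themes on the same base chart without rebuilding it each time.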
Your Turn!
Use your own wrangled data or this code to get started:
library(readr)
library(ggplot2)
dem_summary <- read_csv("https://raw.githubusercontent.com/eteitelbaum/dataviz-fall-2023/main/modules/data/dem_summary.csv")
- glimpse() the data (glimpse() comes from dplyr, so run library(dplyr) first)
- Find a new variable to visualize
- Make a bar chart with it
- Change the color of the bars
- Order the bars
- Add labels
- Add a theme
- Try saving your plot as an object
- Then change the labels and/or theme
Now Try a Histogram
Use this code or your own wrangled data to get started:
dem_women <- read_csv("https://raw.githubusercontent.com/eteitelbaum/dataviz-fall-2023/main/modules/data/dem_women.csv")
- Pick a variable that you want to explore the distribution of
- Make a histogram
  - Only specify x = in aes()
  - Specify geom as geom_histogram()
- Choose color for bars
- Choose appropriate labels
- Add a theme
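The steps above could look roughly like the following sketch. It uses simulated data rather than dem_women.csv, and the column name score is a hypothetical placeholder, so swap in the variable you picked. The bins = 20 value is an arbitrary starting point to adjust.

```r
library(ggplot2)

# Simulated placeholder data; `score` is a hypothetical column,
# not a variable from dem_women.csv
set.seed(123)
toy_data <- data.frame(score = rnorm(200, mean = 50, sd = 10))

toy_hist <- ggplot(toy_data, aes(x = score)) +  # only x is mapped
  geom_histogram(bins = 20, fill = "steelblue", color = "white") +
  labs(
    x = "Score (simulated)",
    y = "Count",
    title = "Distribution of a simulated score"
  ) +
  theme_minimal()
```

Here fill sets the bar color and color sets the bar outline, which helps separate neighboring bins.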